Prosody and Automatic Speech Recognition —- Why not yet a Success Story and where to go from here
نویسندگان
چکیده
We describe the different linguistic and paralinguistic functions of prosody, show how features can be computed that describe the prosodic marking of these functions, and how this knowledge can be used in an automatic speech understanding system. This is done in the context of the speech–to–speech translation system Verbmobil, where prosody is used to segment the user utterance and to find self repairs. We then go on to discuss, why most speech processing systems do not use prosodic information and end by showing some new trends in prosody research, namely the classification of emotion and the classification of “offtalk” (speaking aside).
منابع مشابه
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
متن کاملThe Effects of Culture and Gender on the Recognition of Emotional Speech: Evidence from Persian Speakers Living in a Collectivist Society
This paper reports on a behavioral study that explores the role of culture and gender in the recognition of emotional speech in an under investigated cultural context (a collectivist society: i.e., Iran). Participants were asked to recognize the emotional prosody of a set of validated emotional vocal portrayals (including the five basic emotions). Findings of the experiment were then comp...
متن کاملUsing prosody to improve Mandarin automatic speech recognition
In this paper, these problems of how to model and train Mandarin prosody dependent acoustic model and how to decode input speech based on prosody dependent speech recognition system will be discussed. We use automatic prosody labeling methods to annotate syllable prosodic break type and stress type on continuous speech corpus, and utilize our proposed methods to train prosody dependent tonal sy...
متن کاملProsody for Mandarin speech recognition: a comparative study of read and spontaneous speech
In this paper, we present a comparative study between spontaneous speech and read Mandarin speech in the context of automatic speech recognition. We focus on analysis and modeling of prosodic features, based on a unique speech corpus that contains similar amounts of read and spontaneous speech data from the same group of speakers. Statistical analysis is carried out on tone contours and duratio...
متن کاملInferring stance in news broadcasts from prosodic-feature configurations
Speech conveys many things beyond content, including aspects of appraisal, feeling, and attitude that have not been much studied. In this work we identify 14 aspects of stance that occur frequently in radio news stories and that could be useful for information retrieval, including indications of subjectivity, immediacy, local relevance, and newness. We observe that newsreaders often mark their ...
متن کامل